TOPOLOGY-GUIDED SAMPLING OF NONHOMOGENEOUS
RANDOM PROCESSES

By Konstantin Mischaikow and Thomas Wanner

Rutgers University and George Mason University 

Topological measurements are increasingly being accepted as an 
important tool for quantifying complex structures. In many appli- 
cations, these structures can be expressed as nodal domains of real- 
valued functions and are obtained only through experimental obser- 
vation or numerical simulations. In both cases, the data on which 
the topological measurements are based are derived via some form 
of finite sampling or discretization. In this paper, we present a prob- 
abilistic approach to quantifying the number of components of gen- 
eralized nodal domains of nonhomogeneous random processes on the 
real line via finite discretizations, that is, we consider excursion sets 
of a random process relative to a nonconstant deterministic threshold 
function. Our results furnish explicit probabilistic a priori bounds for 
the suitability of certain discretization sizes and also provide infor- 
mation for the choice of location of the sampling points in order to 
minimize the error probability. We illustrate our results for a variety 
of random processes, demonstrate how they can be used to sample 
the classical nodal domains of deterministic functions perturbed by 
additive noise and discuss their relation to the density of zeros. 

1. Introduction. The motivation for this work comes from our attempts 
to create novel metrics for quantifying, comparing and cataloging large sets 
of complicated varying geometric patterns. Random fields (for a general 
background, see [1, 4, 12, 19, 22], as well as the references therein) provide 
a framework in which to approach these problems and have, over the last 
few decades, emerged as an important tool for studying spatial phenomena 
which involve an element of randomness [1, 2, 24, 27, 29]. For the types
of applications we have in mind [13, 14, 21], we are often satisfied with
a topological classification of sub- or super-level sets of a scalar function.
Algebraic topology, and in particular homology, can be used in a
computationally efficient manner [18] to coarsely quantify these geometric
properties. In past work [7, 23], we developed a probabilistic framework for
assessing the correctness of homology computations for random fields via
uniform discretizations. The approach considers the homology of nodal domains
of random fields which are given by classical Fourier series in one and two
space dimensions, and it provides explicit and sharp error bounds as a
function of the discretization size and averaged Sobolev norms of the random
field. While we do not claim it is trivial (there are complicated
combinatorial questions that need to be resolved), we believe that it is
possible to extend the methods and hence the results of [23] to
higher-dimensional domains.

Fig. 1. Sample functions from a random sum of the form
$u(x,\omega) = \sum_{k=0}^{N} g_k(\omega)\varphi_k(x)$, where $g_1,\dots,g_N$
are independent standard Gaussian random variables. In the left diagram,
we consider random periodic functions, that is, the basis functions
$\varphi_k$ are given by $\varphi_{2k}(x) = \cos(2\pi k x)$ and
$\varphi_{2k-1}(x) = \sin(2\pi k x)$; in the right diagram they are the
Chebyshev polynomials $\varphi_k(x) = \cos(k \arccos x)$. In each case, we
choose $N = 16$.

The more serious restriction in [23] is the use of periodic random fields, 
which due to the fact that the associated spatial correlation function is 
homogeneous, simplifies many of the estimates. In general, however, one 
expects to encounter nonhomogeneous random fields. In such cases, it seems 
unreasonable to expect that uniform sampling provides the optimal choice. 
For example, in Figure 1, three sample functions each are shown for a random 
sum involving periodic basis functions and Chebyshev polynomials. As one 
would expect, the zeros of the random Chebyshev sum are more closely 
spaced at the boundary, and therefore small uniform discretizations are most
likely not optimal for determining the topology of the nodal domains.

With this as motivation, we allow for a more general sampling technique. 
We remark that because of the subtlety of some of the necessary estimates 
we restrict our attention in this paper to one-dimensional domains.

Definition 1.1 (Nonuniform approximation of generalized nodal domains).
Consider a compact interval $[a,b] \subset \mathbb{R}$, a threshold function
$\mu : [a,b] \to \mathbb{R}$, and a function $u : [a,b] \to \mathbb{R}$. Then
we define the generalized nodal domains of $u$ by

$$ N^{\pm} = \{ x \in [a,b] : \pm(u(x) - \mu(x)) \ge 0 \}, \tag{1} $$

which for the case of $\mu(x) = 0$ reduces to the classical definition of a
nodal domain in [5]. An $M$-discretization of $[a,b]$ is a collection of
$M + 1$ grid points

$$ a = x_0 < x_1 < \cdots < x_M = b, $$

and we define $x_{M+1} = x_M = b$ in the following. The cubical
approximations $Q_M^{\pm}$ of the generalized nodal domains $N^{\pm}$ of $u$
are defined as the sets

$$ Q_M^{\pm} := \bigcup \{ [x_k, x_{k+1}] : \pm(u - \mu)(x_k) \ge 0, \ k = 0,\dots,M \}. $$

Given a subset $X \subset [a,b]$, let $\beta_0(X)$ denote the number of
components of $X$. Consider a random field
$u : [a,b] \times \Omega \to \mathbb{R}$ over the probability space
$(\Omega, \mathcal{F}, \mathbb{P})$. We are interested in optimally
characterizing the topology, that is, determining the number of components,
of the nodal domains $N^{\pm}$ in terms of the cubical approximations
$Q_M^{\pm}$. In other words, our goal is to choose the $M$-discretization of
$[a,b]$ in such a way as to optimize

$$ \mathbb{P}\{ \beta_0(N^{\pm}) = \beta_0(Q_M^{\pm}) \}. $$
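As a rough illustration of Definition 1.1, the following Python sketch counts the components of a cubical approximation from the sampled values alone. The function name `cubical_component_count`, the use of NumPy, and the test function $\cos(2\pi x)$ are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def cubical_component_count(u, mu, grid, sign=+1):
    """Count the components of the cubical approximation Q_M^{sign}.

    u, mu : vectorized callables (hypothetical stand-ins for one sample
            function and the threshold); grid : x_0 < ... < x_M.
    """
    grid = np.asarray(grid, dtype=float)
    # Interval [x_k, x_{k+1}] is selected iff sign*(u - mu)(x_k) >= 0.
    active = sign * (u(grid) - mu(grid)) >= 0
    # Neighbouring selected intervals share an endpoint, so each maximal
    # run of selected indices forms exactly one component.
    previous = np.concatenate(([False], active[:-1]))
    return int(np.count_nonzero(active & ~previous))

# Example: u(x) = cos(2*pi*x) on [0, 1] with vanishing threshold.
u = lambda x: np.cos(2.0 * np.pi * x)
mu = lambda x: np.zeros_like(x)
grid = np.linspace(0.0, 1.0, 11)
n_plus = cubical_component_count(u, mu, grid, sign=+1)
n_minus = cubical_component_count(u, mu, grid, sign=-1)
```

For this grid the counts agree with the true nodal domains ($[0, 1/4] \cup [3/4, 1]$ and $[1/4, 3/4]$); a coarser grid can miss sign changes, which is the error event the results below bound.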

We provide two results addressing this question. The first characterizes the
choice of the sampling points $a = x_0 < x_1 < \cdots < x_M = b$ under
reasonably general abstract conditions. More precisely, consider the
following assumptions:

(A1) For every $x \in [a,b]$, we have $\mathbb{P}\{u(x) = \mu(x)\} = 0$.

(A2) The random field is such that
$\mathbb{P}\{u - \mu \text{ has a double zero in } [a,b]\} = 0$.

(A3) For $\sigma \in \{\pm 1\}$, $x \in [a,b]$ and $\delta > 0$ with
$x + \delta \in [a,b]$ define

$$ p_\sigma(x,\delta) = \mathbb{P}\Big\{ \sigma u(x) \ge \sigma\mu(x),\
\sigma u\Big(x + \frac{\delta}{2}\Big) \le \sigma\mu\Big(x + \frac{\delta}{2}\Big),\
\sigma u(x+\delta) \ge \sigma\mu(x+\delta) \Big\}. $$

Then there exists a continuously differentiable function
$C_0 : [a,b] \to \mathbb{R}^+$ as well as a constant $C_1 > 0$ such that for
all $x \in [a,b]$ with $x + \delta \in [a,b]$ we have

$$ p_{+1}(x,\delta) + p_{-1}(x,\delta) \le C_0(x) \cdot \delta^3 + C_1 \cdot \delta^4. $$

In Section 3, we prove the following result.

Theorem 1.2 (Sampling based on local probabilities). Consider a probability
space $(\Omega,\mathcal{F},\mathbb{P})$, a continuous threshold function
$\mu : [a,b] \to \mathbb{R}$, and a random field
$u : [a,b] \times \Omega \to \mathbb{R}$ over
$(\Omega,\mathcal{F},\mathbb{P})$ such that for $\mathbb{P}$-almost all
$\omega \in \Omega$ the function $u(\cdot,\omega) : [a,b] \to \mathbb{R}$ is
continuous. Choose the sampling points $a = x_0 < \cdots < x_M = b$ such that

$$ \int_{x_{k-1}}^{x_k} \sqrt[3]{C_0(x)}\,dx = \frac{1}{M} \int_a^b \sqrt[3]{C_0(x)}\,dx
\quad\text{for all } k = 1,\dots,M, $$

and consider the generalized nodal domains $N^{\pm}(\omega)$ and their
approximations $Q_M^{\pm}(\omega)$ as in Definition 1.1. If assumptions (A1),
(A2) and (A3) hold, then

$$ \mathbb{P}\{\beta_0(N^{\pm}) = \beta_0(Q_M^{\pm})\} \ge 1 - \frac{4}{3M^2}
\Big(\int_a^b \sqrt[3]{C_0(x)}\,dx\Big)^3 + O\Big(\frac{1}{M^3}\Big). \tag{2} $$

This theorem is a direct generalization of the corresponding result in ([23],
Theorem 1.3). Numerical computations presented in Section 2 suggest that
for certain nonhomogeneous random fields this estimate is sharp, and in
fact an enormous improvement over the homogeneous result where $C_0(x)$ is
replaced by $\max_{x \in [a,b]} C_0(x)$.
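The sampling rule of Theorem 1.2 can be sketched numerically as follows. This is a minimal illustration, assuming that $C_0$ is available as a vectorized Python callable and is strictly positive; the helper names `equi_area_points` and `leading_error_bound`, the fine quadrature grid, and the two test densities are hypothetical choices, not part of the paper.

```python
import numpy as np

def equi_area_points(c0, a, b, M, n_fine=20001):
    """Return x_0 < ... < x_M with equal integrals of c0**(1/3) between
    neighbours, together with K = integral of c0**(1/3) over [a, b]."""
    x = np.linspace(a, b, n_fine)
    w = np.cbrt(c0(x))
    # Cumulative trapezoidal integral of the cube root of c0.
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * np.diff(x))))
    # Invert the cumulative integral at the levels k*K/M by interpolation.
    targets = np.linspace(0.0, cum[-1], M + 1)
    pts = np.interp(targets, cum, x)
    pts[0], pts[-1] = a, b
    return pts, cum[-1]

def leading_error_bound(K, M):
    """Leading-order failure probability 4*K**3 / (3*M**2) from (2)."""
    return 4.0 * K**3 / (3.0 * M**2)

# A constant C_0 gives back the uniform grid; a growing C_0 clusters points.
pts_const, K_const = equi_area_points(lambda x: np.ones_like(x), 0.0, 2.0, 4)
pts_lin, K_lin = equi_area_points(lambda x: (1.0 + x) ** 3, 0.0, 1.0, 3)
```

For $C_0(x) = (1+x)^3$ the cumulative integral is $x + x^2/2$, so the points satisfy $x_k = -1 + \sqrt{1 + 2t_k}$ with $t_k = k/2$, which the sketch reproduces up to quadrature error.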

Of course in practice one is interested in applying Theorem 1.2 to specific
random fields. This requires the verification of assumptions (A1), (A2)
and (A3), preferably in terms of central random field characteristics.

Definition 1.3. For a random field $u : [a,b] \times \Omega \to \mathbb{R}$
over a probability space $(\Omega,\mathcal{F},\mathbb{P})$, we define its
spatial correlation function $R : [a,b]^2 \to \mathbb{R}$ as

$$ R(x,y) = \mathbb{E}\big((u(x) - \mathbb{E}u(x))(u(y) - \mathbb{E}u(y))\big)
\quad\text{for all } x, y \in [a,b], $$

where $\mathbb{E}$ denotes the expected value of a random variable over
$(\Omega,\mathcal{F},\mathbb{P})$.

If the random field is sufficiently smooth, then the derivatives of the
spatial correlation function,

$$ R_{k,\ell}(x) = \frac{\partial^{k+\ell} R}{\partial x^k\,\partial y^\ell}(x,x), \tag{3} $$

have a natural interpretation in terms of spatial derivatives of the random
field $u$. Since

$$ R_{k,\ell}(x) = \mathbb{E}\big((u^{(k)}(x) - \mathbb{E}u^{(k)}(x))
(u^{(\ell)}(x) - \mathbb{E}u^{(\ell)}(x))\big), $$

the function $R_{k,k}$ contains averaged information on the square of the
$k$th derivative of the random function $u$, more precisely, its variance.

To relate the spatial correlation function to the function $C_0$ in
Theorem 1.2, we specialize to Gaussian random fields. To be more precise, we
make the following assumptions.

(G1) Consider a Gaussian random field $u : [a,b] \times \Omega \to \mathbb{R}$
over a probability space $(\Omega,\mathcal{F},\mathbb{P})$ such that
$u(\cdot,\omega) : [a,b] \to \mathbb{R}$ is twice continuously differentiable
for $\mathbb{P}$-almost all $\omega \in \Omega$. Furthermore, assume that for
every $x \in [a,b]$ the expected value of $u(x)$ satisfies

$$ \mathbb{E}u(x) = 0. $$

(G2) The spatial correlation function $R$ is three times continuously
differentiable in a neighborhood of the diagonal $x = y$ and the matrix

$$ \mathcal{R}(x) = \begin{pmatrix}
R_{0,0}(x) & R_{1,0}(x) & R_{2,0}(x) \\
R_{1,0}(x) & R_{1,1}(x) & R_{2,1}(x) \\
R_{2,0}(x) & R_{2,1}(x) & R_{2,2}(x)
\end{pmatrix} \tag{4} $$

is positive definite for all $x \in [a,b]$.

We make considerable use of $\mathcal{R}$, and thus introduce the following
notation:

$$ \begin{aligned}
\mathcal{R}^m_{3,3} &:= R_{0,0}R_{1,1} - R_{1,0}^2, \\
\mathcal{R}^m_{3,2} &:= R_{0,0}R_{2,1} - R_{1,0}R_{2,0}, \\
\mathcal{R}^m_{3,1} &:= R_{1,0}R_{2,1} - R_{1,1}R_{2,0}.
\end{aligned} \tag{5} $$

These expressions are just the determinants of minors of $\mathcal{R}$. This
allows us to state the following theorem.

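Theorem 1.4 (Sampling based on spatial correlation). Consider a Gaussian
random field $u : [a,b] \times \Omega \to \mathbb{R}$ satisfying (G1) and
(G2), and a threshold function $\mu : [a,b] \to \mathbb{R}$ of class $C^3$.
Choose the sampling points $a = x_0 < \cdots < x_M = b$ in such a way that

$$ \int_{x_{k-1}}^{x_k} \sqrt[3]{C(x)}\,dx = \frac{1}{M} \int_a^b \sqrt[3]{C(x)}\,dx
\quad\text{for all } k = 1,\dots,M, $$

where

$$ C(x) = \frac{\det\mathcal{R}(x)}{48\pi\,\mathcal{R}^m_{3,3}(x)^{3/2}}
\cdot (1 + A(x)) \cdot e^{-B(x)}, \tag{6} $$

with $A(x)$ and $B(x)$ given by

$$ A(x) = \frac{\big(\mathcal{R}^m_{3,1}(x)\mu(x) - \mathcal{R}^m_{3,2}(x)\mu'(x)
+ \mathcal{R}^m_{3,3}(x)\mu''(x)\big)^2}{\mathcal{R}^m_{3,3}(x)\,\det\mathcal{R}(x)}, $$

$$ B(x) = \frac{\big(R_{1,0}(x)\mu(x) - R_{0,0}(x)\mu'(x)\big)^2
+ \mathcal{R}^m_{3,3}(x)\mu(x)^2}{2R_{0,0}(x)\,\mathcal{R}^m_{3,3}(x)}. $$

Let $Q_M^{\pm}(\omega)$ denote the cubical approximations of the random
generalized nodal domains $N^{\pm}(\omega)$ of $u(\cdot,\omega)$. Then

$$ \mathbb{P}\{\beta_0(N^{\pm}) = \beta_0(Q_M^{\pm})\} \ge 1 - \frac{4}{3M^2}
\Big(\int_a^b \sqrt[3]{C(x)}\,dx\Big)^3 + O\Big(\frac{1}{M^3}\Big). \tag{7} $$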

The proof of Theorem 1.4 is presented in Section 5. However, it depends 
on nontrivial results concerning the asymptotic behavior of sign-distribution 
probabilities of parameter-dependent Gaussian random variables. These re- 
sults are developed in Section 4. 
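To make the ingredients of Theorem 1.4 concrete, here is a small Python sketch that evaluates $C(x)$ at a single point from the six entries of the matrix in (4) and the values of the threshold function and its first two derivatives. It assumes the form $C = \det\mathcal{R}/(48\pi(\mathcal{R}^m_{3,3})^{3/2}) \cdot (1+A) \cdot e^{-B}$, which is consistent with the special cases (15) to (19) below; the function name `C_value` and its argument names are hypothetical.

```python
import numpy as np

def C_value(R00, R10, R20, R11, R21, R22, mu=0.0, dmu=0.0, d2mu=0.0):
    """Evaluate C(x) at one point from R_{k,l}(x), mu(x), mu'(x), mu''(x)."""
    Rmat = np.array([[R00, R10, R20],
                     [R10, R11, R21],
                     [R20, R21, R22]], dtype=float)
    det = np.linalg.det(Rmat)
    # Determinants of the minors along the third row, as in (5).
    m33 = R00 * R11 - R10**2
    m32 = R00 * R21 - R10 * R20
    m31 = R10 * R21 - R11 * R20
    A = (m31 * mu - m32 * dmu + m33 * d2mu) ** 2 / (m33 * det)
    B = ((R10 * mu - R00 * dmu) ** 2 + m33 * mu**2) / (2.0 * R00 * m33)
    return det / (48.0 * np.pi * m33**1.5) * (1.0 + A) * np.exp(-B)

# Entries for R(x, y) = (1 + x*y)**5, computed by hand at x = 0 and x = 1.
base = np.sqrt(5.0) * 4.0 / (24.0 * np.pi)      # (16) at x = 0 with N = 5
c_at_0 = C_value(1.0, 0.0, 0.0, 5.0, 0.0, 40.0)
c_at_1 = C_value(32.0, 80.0, 160.0, 240.0, 560.0, 1520.0)
c_shifted = C_value(1.0, 0.0, 0.0, 5.0, 0.0, 40.0, mu=2.0)
```

With vanishing threshold the two values reproduce the closed form (16); with the constant threshold $\mu = 2$ at $x = 0$ the value is damped by the factor $e^{-2}$.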

The number of nodal domains $\beta_0(N^{\pm})$ is clearly dependent upon the
zeros of $u - \mu$. Thus, it is reasonable to expect that there is some
relationship between the function $C$ derived in Theorem 1.4 and the density
of the zeros of the random field $u$. The first step is to obtain a density
function. For this, a weaker form of (G2) is sufficient.

(G3) Assume that the spatial correlation function $R$ is two times
continuously differentiable in a neighborhood of the diagonal $x = y$ and
that $R(x,x) > 0$ for all $x \in [a,b]$.

Finding the density of the zeros of random fields has been studied in a
variety of settings, see, for example, [2, 4, 6, 11, 12], as well as the
references therein. The following theorem can be found in [6], (13.2.1),
page 285.

Theorem 1.5 (Density of zeros of a random field). Consider a Gaussian
random field $u : [a,b] \times \Omega \to \mathbb{R}$ satisfying (G1) and
(G3). Then the density function for the number of zeros of $u$ is given by

$$ \mathcal{D}(x) = \frac{\sqrt{\mathcal{R}^m_{3,3}(x)}}{\pi R_{0,0}(x)}
= \frac{\sqrt{R_{0,0}(x)R_{1,1}(x) - R_{1,0}(x)^2}}{\pi R_{0,0}(x)}. \tag{8} $$

In other words, for every interval $I \subset [a,b]$ the expected number of
zeros of $u$ in $I$ is given by $\int_I \mathcal{D}(x)\,dx$.

While Theorem 1.5 has been known for quite some time, its implications
are surprising. As is demonstrated through examples in Section 2, there is
no simple discernible relationship between the function $C^{1/3}$ of
Theorem 1.4 and the density function $\mathcal{D}$.
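A minimal numerical sketch of the density of zeros, assuming it has the form $\sqrt{R_{0,0}R_{1,1} - R_{1,0}^2}/(\pi R_{0,0})$ and using the correlation function $R(x,y) = (1+xy)^N$ of the random polynomials (13) as a test case. The helper names `zero_density` and `binomial_R`, and the trapezoidal quadrature, are choices made for this illustration only.

```python
import numpy as np

def zero_density(R00, R10, R11):
    """Density of zeros from the diagonal derivatives of R."""
    return np.sqrt(R00 * R11 - R10**2) / (np.pi * R00)

def binomial_R(x, N):
    """R_{0,0}, R_{1,0}, R_{1,1} for R(x, y) = (1 + x*y)**N."""
    s = 1.0 + x**2
    R00 = s**N
    R10 = N * x * s ** (N - 1)
    R11 = N * s ** (N - 2) * (1.0 + N * x**2)
    return R00, R10, R11

N = 5
x = np.linspace(-3.0, 3.0, 6001)
D = zero_density(*binomial_R(x, N))
# Expected number of zeros in [-3, 3] via the trapezoidal rule.
expected_zeros = np.sum(0.5 * (D[1:] + D[:-1]) * np.diff(x))
```

In this example the density reduces to $N^{1/2}/(\pi(1+x^2))$, so the expected number of zeros in $[-3,3]$ is $2\sqrt{N}\arctan(3)/\pi$, roughly 1.78 for $N = 5$.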

As is made clear at the beginning of this Introduction, our motivation 
is to develop optimal sampling methods for the analysis of complicated 
time-dependent patterns. Thus, before turning to the proofs of the above- 
mentioned results, we begin, in Section 2, with demonstrations of possible 
applications and implications of Theorem 1.4. In particular, we consider
several random generalized Fourier series
$u : [a,b] \times \Omega \to \mathbb{R}$ defined by

$$ u(x,\omega) = \sum_{k=0}^{\infty} g_k(\omega) \cdot \varphi_k(x), \tag{9} $$

where $\varphi_k : [a,b] \to \mathbb{R}$, $k \in \mathbb{N}_0$, denotes a
family of smooth functions and we assume that the Gaussian random variables
$g_k : \Omega \to \mathbb{R}$, $k \in \mathbb{N}_0$, are defined over a
common probability space $(\Omega,\mathcal{F},\mathbb{P})$ with mean 0.

We conclude the paper with a general discussion of future work concerning 
natural generalizations to higher dimensions. 

2. Sampling of specific random sums. To demonstrate the applicability
and implications of Theorem 1.4, we consider in this section several random
generalized Fourier series $u : [a,b] \times \Omega \to \mathbb{R}$ of the
form in (9). As mentioned before, the functions
$\varphi_k : [a,b] \to \mathbb{R}$, $k \in \mathbb{N}_0$, denote a family of
smooth functions and we assume that the random variables
$g_k : \Omega \to \mathbb{R}$, $k \in \mathbb{N}_0$, are Gaussian with
vanishing mean, and defined over a common probability space
$(\Omega,\mathcal{F},\mathbb{P})$. We would like to point out that these
random variables do not need to be independent, and we define

$$ \alpha_{k,m} = \mathbb{E}(g_k g_m) \quad\text{for all } k, m \in \mathbb{N}_0. $$

Then one can easily show that

$$ R_{k,\ell}(x) = \mathbb{E}\big(u^{(k)}(x)\,u^{(\ell)}(x)\big)
= \sum_{i,j=0}^{\infty} \alpha_{i,j}\,\varphi_i^{(k)}(x)\,\varphi_j^{(\ell)}(x). $$

If in addition the random variables $g_k$ are pairwise independent, then we
have

$$ R_{k,\ell}(x) = \sum_{j=0}^{\infty} \alpha_{j,j}\,\varphi_j^{(k)}(x)\,\varphi_j^{(\ell)}(x), $$

where $\alpha_{j,j} \ge 0$ for all $j \in \mathbb{N}_0$. One can show that
this diagonalization can always be achieved for Gaussian random fields,
provided the basis functions $\varphi_k$ are chosen appropriately. For more
details, we refer the reader to [2], Theorems 3.1.1 and 3.1.2, Lemma 3.1.4.
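For independent coefficients, the diagonal formula for $R_{k,\ell}(x)$ can be evaluated directly from the basis functions and their derivatives. The sketch below does this for a polynomial basis using NumPy's `Polynomial` class; the function name `correlation_derivatives` and the choice of the monomial basis with binomial variances, as in (13), are assumptions for illustration.

```python
import numpy as np
from math import comb
from numpy.polynomial import Polynomial

def correlation_derivatives(basis, variances, x, max_order=2):
    """Matrix with entries R_{k,l}(x) = sum_j var_j * phi_j^(k)(x) * phi_j^(l)(x)."""
    derivs = np.array([
        [(p.deriv(k) if k > 0 else p)(x) for k in range(max_order + 1)]
        for p in basis
    ])                                   # shape (number of basis functions, max_order + 1)
    return np.einsum("j,jk,jl->kl", variances, derivs, derivs)

# Monomials x**k with variances binom(N, k), evaluated at x = 1 for N = 5.
N = 5
basis = [Polynomial.basis(k) for k in range(N + 1)]
variances = np.array([comb(N, k) for k in range(N + 1)], dtype=float)
R_at_1 = correlation_derivatives(basis, variances, 1.0)
```

The resulting 3-by-3 matrix is the matrix $\mathcal{R}(1)$ from (4) for this random polynomial; its top-left entry equals $(1 + 1)^5 = 32$.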

Within the above framework of random generalized Fourier series, we
specifically consider several classes:

• Random Chebyshev polynomials $u : [-1,1] \times \Omega \to \mathbb{R}$ of
the form

$$ u(x,\omega) = \sum_{k=0}^{N} g_k(\omega) \cdot \cos(k \arccos x)
\quad\text{with}\quad \mathbb{E}(g_k g_\ell) = \delta_{k,\ell}. \tag{10} $$

• Random cosine series $u : [0,1] \times \Omega \to \mathbb{R}$ of the form

$$ u(x,\omega) = \sum_{k=0}^{N} g_k(\omega) \cdot \cos(k\pi x)
\quad\text{with}\quad \mathbb{E}(g_k g_\ell) = \delta_{k,\ell}. \tag{11} $$

• Random $L$-periodic functions $u : \mathbb{R} \times \Omega \to \mathbb{R}$
of the form

$$ u(x,\omega) = \sum_{k=0}^{\infty} a_k \cdot \Big( g_{2k}(\omega) \cdot \cos\frac{2\pi k x}{L}
+ g_{2k-1}(\omega) \cdot \sin\frac{2\pi k x}{L} \Big)
\quad\text{with}\quad \mathbb{E}(g_k g_\ell) = \delta_{k,\ell}, \tag{12} $$

with real constants $a_k$.

• Random polynomials $u : [-3,3] \times \Omega \to \mathbb{R}$ with Gaussian
coefficients of binomial variance of the form

$$ u(x,\omega) = \sum_{k=0}^{N} g_k(\omega) \cdot x^k
\quad\text{with}\quad \mathbb{E}(g_k g_\ell) = \delta_{k,\ell} \cdot \binom{N}{k}. \tag{13} $$

• Random polynomials $u : [-3,3] \times \Omega \to \mathbb{R}$ with Gaussian
coefficients of unit variance of the form

$$ u(x,\omega) = \sum_{k=0}^{N} g_k(\omega) \cdot x^k
\quad\text{with}\quad \mathbb{E}(g_k g_\ell) = \delta_{k,\ell}. \tag{14} $$

As is indicated in Section 1, we assume that all the random coefficients are
centered Gaussian random variables over a common probability space
$(\Omega,\mathcal{F},\mathbb{P})$.

2.1. The case of vanishing threshold function. We begin our applications
by thresholding sample random sums at their expected value, that is, we
use the threshold function $\mu = 0$. In this particular case, the function
$C(x)$ defined by (6) in Theorem 1.4 simplifies to

$$ C(x) = \frac{\det\mathcal{R}(x)}{48\pi\,\mathcal{R}^m_{3,3}(x)^{3/2}}, \tag{15} $$

since both $A(x)$ and $B(x)$ vanish.

For the case of random Chebyshev polynomials (10), the left diagram in
Figure 2 shows three normalized sample functions

$$ \frac{C^{1/3}(x)}{\int_{-1}^{1} C^{1/3}(x)\,dx} $$

for $N = 3, 5, 10$. The right diagram shows the expected number of zeros of
the random Chebyshev polynomials as a function of $N$ (red curve), which
grows proportional to $N$. Thus, in order to sample the random field
sufficiently finely, we expect to use significantly more than $O(N)$
discretization points. The blue curve in the right diagram of Figure 2 shows
the values of $M$ for which the bound in (7) of Theorem 1.4 implies a
correctness probability of 95%, and a least squares fit of this curve
furnishes $M \sim N^{3/2}$. For comparison, the green curve in the same
diagram shows the values of $M$ for which the bound in our previous result
([23], Theorem 1.4) implies a correctness probability of 95%, provided we
apply this theorem with $C_0$ given as $\max_{x \in [-1,1]} C_0(x)$. Notice
that in this case we have $M \sim N^3$. In other words, only the
topology-guided sampling result of the current paper yields a reasonable
growth for the number of sampling points. In fact, based on our results for
periodic random fields in [23] and the numerical simulations in [7], we
expect that $M \sim N^{3/2}$ is the optimal discretization size.

Fig. 2. Topology-guided sampling of random Chebyshev polynomials (10). The
left diagram shows the functions $C^{1/3}$ for $N = 3, 5, 10$ (red, blue and
green, respectively; increasing values of $N$ increase the number of
extrema); for comparison reasons, each curve has been scaled in such a way
that the area under the graph is one. The right diagram shows the expected
number of zeros of the random Chebyshev polynomials as a function of $N$
(bottom red curve), the value of $M$ for which Theorem 1.4 gives a
correctness probability of 95% (middle blue curve), and the value of $M$ for
which [23] gives a correctness probability of 95% (top green curve) with
$C_0 = \max C_0(x)$.

For the case of random cosine sums (11), that is, random trigonometric
sums satisfying homogeneous Neumann boundary conditions, the analogue
of the right diagram in Figure 2 is depicted in the left diagram of Figure 3.
Notice that for the random cosine sums the expected number of zeros is
proportional to $N$, and the required number of sampling points has to be
proportional to $N^{3/2}$ for both Theorem 1.4 and [23], Theorem 1.4. In
other words, in this situation the gains from topology-guided sampling are
no longer as large as in the context of Chebyshev polynomials. Also in this
case, the curves for $M$ are obtained in such a way that the right-hand side
in (7) or the corresponding bound in [23] equals 95%.

Similar behavior can be seen in the case of random polynomials (13)
with Gaussian coefficients of binomial variance; see the right diagram of
Figure 3. For the random algebraic polynomials (13), one can show that the
expected number of zeros is proportional to $N^{1/2}$, and the required
number of sampling points implied by (7) or [23] has to be proportional to
$N^{3/4}$ for both results. In fact, the function $C$ can be computed
explicitly in this case. Due to (13), the spatial correlation function $R$
is given by

$$ R(x,y) = \sum_{k=0}^{N} \binom{N}{k} x^k y^k = (1 + xy)^N, $$

which after some elementary computations furnishes

$$ C(x) = \frac{N^{1/2}(N-1)}{24\pi(1 + x^2)^3}. \tag{16} $$

As for the case of random polynomials with Gaussian coefficients of unit
variance, a classical result due to Kac [16, 17] implies that the expected
number of zeros is proportional to $\log N$. In this case, Theorem 1.4
implies that the required number of sampling points has to be proportional
to $(\log N)^{3/2}$.
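The discretization sizes quoted above follow from setting the leading-order term of (7) equal to the allowed failure probability. The following sketch does this for the binomial-variance polynomials (13) on $[-3,3]$, where (16) gives $\int C^{1/3}$ in closed form; the helper names `required_M` and `K_binomial` are hypothetical, and the $O(M^{-3})$ remainder is ignored.

```python
import math

def required_M(K, confidence=0.95):
    """Smallest M with 4*K**3 / (3*M**2) <= 1 - confidence (leading order only)."""
    return math.ceil(math.sqrt(4.0 * K**3 / (3.0 * (1.0 - confidence))))

def K_binomial(N, half_width=3.0):
    """Integral of C**(1/3) over [-half_width, half_width] for C from (16)."""
    c = math.sqrt(N) * (N - 1) / (24.0 * math.pi)
    # The integral of 1 / (1 + x**2) over the interval is 2*atan(half_width).
    return c ** (1.0 / 3.0) * 2.0 * math.atan(half_width)

K5 = K_binomial(5)
M5 = required_M(K5)
```

Since $K^3$ grows like $N^{3/2}$ here, the resulting $M$ grows like $N^{3/4}$, in line with the scaling stated for (13).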

2.2. The case of constant threshold function. We now turn our attention
to a constant threshold function $\mu(x) = \tau$, for some real number
$\tau$. In this case, the function $C(x)$ in Theorem 1.4 simplifies to

$$ C(x) = \frac{\det\mathcal{R}(x)}{48\pi\,\mathcal{R}^m_{3,3}(x)^{3/2}} \cdot S(x), \tag{17} $$

where

$$ S(x) = \Big(1 + \frac{\mathcal{R}^m_{3,1}(x)^2\,\tau^2}{\mathcal{R}^m_{3,3}(x)\,\det\mathcal{R}(x)}\Big)
\cdot \exp\Big(-\frac{\big(R_{1,0}(x)^2 + \mathcal{R}^m_{3,3}(x)\big)\,\tau^2}
{2R_{0,0}(x)\,\mathcal{R}^m_{3,3}(x)}\Big). \tag{18} $$

Fig. 3. Topology-guided sampling of random trigonometric polynomials (11)
satisfying Neumann boundary conditions (left diagram) and random algebraic
polynomials (13) with binomial variances (right diagram). The curves show
the expected numbers of zeros (bottom red curve), the discretization size
required by Theorem 1.4 to achieve 95% correctness (middle blue curve), and
the discretization size required by [23] for a correctness probability of
95% (top green curve), with $C_0 = \max C_0(x)$.

Fig. 4. Effect of varying the threshold $\tau$ on the function $C(x)$ in (17)
for random Chebyshev polynomials (10) with $N = 5$. The left diagram shows
the function $C(x)$ for $\tau = 0, 1, 2, 3, 4, 5$ (black, green, cyan, red,
magenta, blue), the right diagram shows only the function $S(x)$ defined
in (18).

For large values of $|\tau|$, the scaling function $S(x)$ will be close to
zero, and it therefore effectively decreases the probability for mistakes in
the homology computation. In fact, it decreases exponentially fast with
respect to $|\tau|$. However, as is shown in Figure 4 for the random
Chebyshev polynomials (10), for values of $\tau$ close to zero, there can be
regions in which the probability for mistakes actually increases. This
behavior is even more pronounced in the case of random algebraic
polynomials (13) and (14), which is shown in Figure 5.

2.3. The case of varying threshold function. We now consider the case
of a general threshold function under the following assumptions. Suppose
a deterministic function $\mu(x)$ is perturbed by a centered Gaussian random
field $u(x,\omega)$, and that we are interested in determining the classical
nodal domains of the sum

$$ v(x,\omega) = \mu(x) + u(x,\omega). $$

Sampling $v(x,\omega)$ at the threshold zero is obviously equivalent to
sampling $u(x,\omega)$ at the threshold $-\mu(x)$. Thus, we can use
Theorem 1.4 to find the optimal location of the sampling points using the
function $C(x)$ defined in (6).

Fig. 5. Effect of varying the threshold $\tau$ on the function $C(x)$ in (17)
for random algebraic polynomials (13) (top row) and (14) (bottom row) with
$N = 5$. In each row, the left diagram shows $C(x)$ for
$\tau = 0, 1, 2, 3, 4, 5$ (black, green, cyan, red, magenta, blue), and the
right diagram shows only the function $S(x)$ defined in (18).



In order to demonstrate the effects of the varying threshold function
$-\mu(x)$ more clearly, we now assume that the perturbing random field $u$
is homogeneous, that is, we assume that $u$ is a random $L$-periodic
function of the form (12). Furthermore, we assume that the real scaling
factors $a_k$ in (12) satisfy

$$ \sum_{k=0}^{\infty} k^6 a_k^2 < \infty, $$

and that at least two of the $a_k$ do not vanish. It was shown in [23] that
in this case the spatial correlation function $R$ is given by

$$ R(x,y) = \mathbb{E}u(x)u(y) = \sum_{k=0}^{\infty} a_k^2 \cdot \cos\frac{2\pi k(x - y)}{L}. $$

From this, one can readily see that the matrix function $\mathcal{R}(x)$
defined in (4) is constant and given by

$$ \mathcal{R}(x) = \begin{pmatrix}
A_0 & 0 & -\dfrac{4\pi^2 A_1}{L^2} \\
0 & \dfrac{4\pi^2 A_1}{L^2} & 0 \\
-\dfrac{4\pi^2 A_1}{L^2} & 0 & \dfrac{16\pi^4 A_2}{L^4}
\end{pmatrix}
\quad\text{where}\quad
A_\ell = \sum_{k=0}^{\infty} k^{2\ell} a_k^2. $$

Thus, the function $C(x)$ in (6) is now given as

$$ C(x) = \frac{\pi^2}{6L^3} \cdot \frac{A_0A_2 - A_1^2}{A_0^{3/2}A_1^{1/2}} \cdot S(x), \tag{19} $$

where

$$ S(x) = \Big(1 + \frac{\big(A_1\mu(x) + A_0\mu''(x) \cdot (L^2/(4\pi^2))\big)^2}{A_0(A_0A_2 - A_1^2)}\Big)
\times \exp\Big(-\frac{A_1\mu(x)^2 + A_0\mu'(x)^2 \cdot (L^2/(4\pi^2))}{2A_0A_1}\Big). $$

Notice that the exponential factor is bounded above by
$\exp(-\mu(x)^2/(2A_0))$, that is, large function values of $\mu(x)$ lead to
small failure probabilities.
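The homogeneous case can be evaluated with a few lines of Python. The sketch below assumes the closed form of (19) in terms of the sums $A_\ell = \sum_k k^{2\ell}a_k^2$; the function name `homogeneous_C`, the period $L = 2$, and the cubic threshold are illustrative assumptions.

```python
import numpy as np

def homogeneous_C(x, mu, dmu, d2mu, a, L):
    """Evaluate C(x) from (19) for scaling factors a[0], a[1], ... and period L."""
    k = np.arange(len(a))
    A0, A1, A2 = (np.sum(k ** (2 * l) * a**2) for l in range(3))
    s = L**2 / (4.0 * np.pi**2)
    prefactor = np.pi**2 / (6.0 * L**3) * (A0 * A2 - A1**2) / (A0**1.5 * np.sqrt(A1))
    one_plus_A = 1.0 + (A1 * mu(x) + A0 * d2mu(x) * s) ** 2 / (A0 * (A0 * A2 - A1**2))
    B = (A1 * mu(x) ** 2 + A0 * dmu(x) ** 2 * s) / (2.0 * A0 * A1)
    return prefactor * one_plus_A * np.exp(-B)

# Scaling factors a_k = N**(-1/2) for k = 1..N and a_0 = 0, so Var u(x) = 1.
N, L = 5, 2.0
a = np.zeros(N + 1)
a[1:] = N ** -0.5
zero = lambda x: np.zeros_like(x)
x = np.linspace(-1.0, 1.0, 5)
C_flat = homogeneous_C(x, zero, zero, zero, a, L)

def C_cubic(tau):
    """C(x) for the threshold x - x**3 + tau (first and second derivative supplied)."""
    return homogeneous_C(x, lambda t: t - t**3 + tau,
                         lambda t: 1.0 - 3.0 * t**2, lambda t: -6.0 * t, a, L)
```

For vanishing threshold the result is constant in $x$, as expected for a homogeneous field, and a large offset $\tau$ reduces $C$ at $x = 0$.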

We close this subsection by visualizing the function $C(x)$ defined in (19)
for the deterministic function $\mu(x) = x - x^3 + \tau$ and $\tau$-values
between 0 and 3. The specific functions $\mu(x)$ are shown in the left image
of Figure 6. In the right image, the corresponding functions $C(x)$ are
shown, where $u$ is defined as in (12) with $a_k = 0$ for $k = 0$ and
$k > N$, as well as $a_k = N^{-1/2}$ for $k = 1,\dots,N$. This implies that
the variance of $u(x)$ equals 1. In Figure 6, we use $N = 5$.



2.4. Comparison with density-guided sampling. In order to illustrate the
differences between the density of zeros $\mathcal{D}$ derived in
Theorem 1.5 and the function $C^{1/3}$ from Theorem 1.4, we return to our
examples from the last section. For each of these examples, Figure 7 depicts
both

$$ \frac{C^{1/3}(x)}{\int C^{1/3}(x)\,dx} \quad\text{and}\quad
\frac{\mathcal{D}(x)}{\int \mathcal{D}(x)\,dx} $$

for the case $N = 5$. It is evident from these graphs that in most cases,
the homology-based sampling density is different from the actual density of
zeros. In fact, in many cases it behaves anticyclic to $\mathcal{D}$ in the
sense that the local extrema of $C^{1/3}$ alternate with the local extrema
of $\mathcal{D}$.

Fig. 6. Sampling of deterministic functions $\mu(x)$ perturbed by
homogeneous random noise. The left image shows the functions
$\mu(x) = x - x^3 + \tau$ for $\tau = 0, 0.5, 1, 2, 3$ (green, cyan, red,
magenta, blue), the right image shows the corresponding functions $C(x)$
defined in (19).

There are, however, exceptions, as the case of the random algebraic
polynomial (13) demonstrates. In this case, it follows from Theorem 1.5 that

$$ \mathcal{D}(x) = \frac{N^{1/2}}{\pi(1 + x^2)}, $$

and together with (16) this shows that the normalized $C^{1/3}$- and
$\mathcal{D}$-functions coincide.
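The coincidence can be checked in one line. Taking the cube root of (16) and normalizing over the interval $[-3,3]$ used for (13), the constants cancel in both quotients (a short worked computation, written out here for convenience):

```latex
\frac{C^{1/3}(x)}{\int_{-3}^{3} C^{1/3}(s)\,ds}
  = \frac{\left(\frac{N^{1/2}(N-1)}{24\pi}\right)^{1/3} \frac{1}{1+x^2}}
         {\left(\frac{N^{1/2}(N-1)}{24\pi}\right)^{1/3} \cdot 2\arctan 3}
  = \frac{1}{2\arctan(3)\,(1+x^2)}
  = \frac{\frac{N^{1/2}}{\pi(1+x^2)}}{\frac{N^{1/2}}{\pi} \cdot 2\arctan 3}
  = \frac{\mathcal{D}(x)}{\int_{-3}^{3} \mathcal{D}(s)\,ds}.
```

Both normalized functions are therefore independent of $N$ in this example.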



3. Sampling based on local probabilities. The goal of this section is the 
proof of Theorem 1.2, which is a generalization of [23], Theorem 1.3. Thus, 
we begin by recalling some basic definitions and results. 

As is indicated in Section 1, given a continuous function
$u : [a,b] \to \mathbb{R}$ and a continuous threshold
$\mu : [a,b] \to \mathbb{R}$ we are interested in determining the number of
components of the generalized nodal domain $N^{\pm}$ in terms of a cubical
approximation $Q_M^{\pm}$ obtained via sampling at $M + 1$ points as
described in Definition 1.1. For suitably chosen discretization points, and
under appropriate regularity and nondegeneracy conditions on $u$ one can
then expect that the number of components of $N^{\pm}$ and $Q_M^{\pm}$
agree. One only has to be able to verify that the function $u - \mu$ has at
most one zero (counting multiplicity) in each of the intervals
$[x_{k-1}, x_k]$, for $k = 1,\dots,M$. This is accomplished using the
following framework which goes back to Dunnage [10].

Fig. 7. A comparison of the function $C^{1/3}$ and the density function
$\mathcal{D}$ for random Chebyshev polynomials (10) (top left diagram),
random trigonometric polynomials (11) (top right), random algebraic
polynomials (13) (bottom left), and random algebraic polynomials (14)
(bottom right). In all cases, the areas under the graphs have been
normalized to one, and we chose $N = 5$.

Definition 3.1. A continuous function $u : [a,b] \to \mathbb{R}$ has a
double crossover on the interval $[\alpha,\beta] \subset [a,b]$, if

$$ \sigma \cdot u(\alpha) \ge 0, \quad
\sigma \cdot u\Big(\frac{\alpha + \beta}{2}\Big) \le 0 \quad\text{and}\quad
\sigma \cdot u(\beta) \ge 0 \tag{20} $$

for one choice of the sign $\sigma \in \{\pm 1\}$.

Definition 3.2. Let $u : [a,b] \to \mathbb{R}$ be a continuous function.

• The dyadic points in the interval $[\alpha,\beta]$ are defined as

$$ d_{n,k} = \alpha + (\beta - \alpha) \cdot \frac{k}{2^n}
\quad\text{for all } k = 0,\dots,2^n \text{ and } n \in \mathbb{N}_0. $$

The dyadic subintervals of $[\alpha,\beta]$ are the intervals
$[d_{n,k}, d_{n,k+1}]$ for all $k = 0,\dots,2^n - 1$ and $n \in \mathbb{N}_0$.

• The interval $[\alpha,\beta] \subset [a,b]$ is admissible for $u$, if the
function $u$ does not have a double crossover on any of the dyadic
subintervals of $[\alpha,\beta]$.
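Definitions 3.1 and 3.2 translate almost literally into code. The sketch below checks admissibility only up to a finite dyadic level, which is an assumption made so that the check terminates (the definition itself quantifies over all levels); the function names and the test functions are hypothetical.

```python
def has_double_crossover(f, alpha, beta):
    """Check condition (20) for f on [alpha, beta] with either sign."""
    fa, fm, fb = f(alpha), f(0.5 * (alpha + beta)), f(beta)
    return any(s * fa >= 0 and s * fm <= 0 and s * fb >= 0 for s in (+1, -1))

def is_admissible(f, alpha, beta, max_level=10):
    """Test all dyadic subintervals of [alpha, beta] up to level max_level."""
    for n in range(max_level + 1):
        h = (beta - alpha) / 2**n
        for k in range(2**n):
            if has_double_crossover(f, alpha + k * h, alpha + (k + 1) * h):
                return False
    return True

one_zero = lambda x: x - 0.3                      # a single simple zero
two_far_zeros = lambda x: (x - 0.3) * (x - 0.6)   # detected at level 0
two_close_zeros = lambda x: (x - 0.3) * (x - 0.35)  # detected only at level 3
```

The last example shows why the truncation level matters: two zeros at 0.3 and 0.35 produce a double crossover only on the dyadic subinterval $[0.25, 0.375]$.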

It was shown in [23] that the concept of admissibility implies the
suitability of our nodal domain approximations. More precisely, the
following is a slight rewording of [23], Proposition 2.5.

Proposition 3.3 (Validation criterion). Let $u : [a,b] \to \mathbb{R}$ be a
continuous function and let $\mu : [a,b] \to \mathbb{R}$ be a continuous
threshold function. Let $N^{\pm}$ denote the generalized nodal domains of
$u$, and let $Q_M^{\pm}$ denote their cubical approximations as in
Definition 1.1. Furthermore, assume that the following hold:

(a) The function $u - \mu$ is nonzero at all grid points $x_k$, for
$k = 0,\dots,M$.

(b) The function $u - \mu$ has no double zero in $(a,b)$, that is, if
$x \in (a,b)$ is a zero of $u - \mu$, then $u - \mu$ attains both positive
and negative function values in every neighborhood of $x$.

(c) For every $k = 1,\dots,M$, the interval $[x_{k-1}, x_k]$ between
consecutive discretization points is admissible for $u - \mu$ in the sense
of Definition 3.2.

Then we have

$$ \beta_0(N^{\pm}) = \beta_0(Q_M^{\pm}). $$

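The following lemma provides bounds on the probability for admissibility
of a given interval.

Lemma 3.4. Consider a probability space $(\Omega,\mathcal{F},\mathbb{P})$, a
continuous threshold function $\mu : [a,b] \to \mathbb{R}$, and a random
field $u : [a,b] \times \Omega \to \mathbb{R}$ over
$(\Omega,\mathcal{F},\mathbb{P})$ such that $u(\cdot,\omega)$ is continuous
for $\mathbb{P}$-almost all $\omega \in \Omega$. In addition, assume that
(A1), (A2) and (A3) hold. If $[x, x + \delta] \subset [a,b]$, then

$$ \mathbb{P}\{[x, x + \delta] \text{ is not admissible for } u - \mu\}
\le \frac{4C_0(x)}{3} \cdot \delta^3 + \Big(\frac{4L}{3} + \frac{8C_1}{7}\Big) \cdot \delta^4, \tag{21} $$

where $L = \max\{|C_0'(y)| : y \in [a,b]\}$.

Proof. If the interval $I = [x, x + \delta]$ is not admissible, then the
function $u - \mu$ has a double crossover on one of its dyadic subintervals.
If we now denote the dyadic points in $I$ by $d_{n,k}$ as in Definition 3.2,
then together with (A3) one obtains the estimate

$$ \begin{aligned}
\mathbb{P}\{I \text{ is not admissible}\}
&\le \sum_{n=0}^{\infty} \sum_{k=0}^{2^n-1}
\big(p_{+1}(d_{n,k}, \delta/2^n) + p_{-1}(d_{n,k}, \delta/2^n)\big) \\
&\le \sum_{n=0}^{\infty} \sum_{k=0}^{2^n-1}
\Big(C_0(d_{n,k}) \cdot \Big(\frac{\delta}{2^n}\Big)^3 + C_1 \cdot \Big(\frac{\delta}{2^n}\Big)^4\Big).
\end{aligned} $$

Since $C_0$ is continuously differentiable, we can define
$L = \max\{|C_0'(y)| : y \in [a,b]\}$, and the definition of the dyadic
points implies

$$ C_0(d_{n,k}) \le C_0(x) + L \cdot (d_{n,k} - x) \le C_0(x) + L\delta. $$

This finally furnishes

$$ \begin{aligned}
\mathbb{P}\{I \text{ is not admissible}\}
&\le \sum_{n=0}^{\infty} \sum_{k=0}^{2^n-1}
\Big((C_0(x) + L\delta) \cdot \Big(\frac{\delta}{2^n}\Big)^3 + C_1 \cdot \Big(\frac{\delta}{2^n}\Big)^4\Big) \\
&= (C_0(x) + L\delta)\,\delta^3 \sum_{n=0}^{\infty} 4^{-n} + C_1\delta^4 \sum_{n=0}^{\infty} 8^{-n} \\
&= \frac{4C_0(x)}{3}\,\delta^3 + \Big(\frac{4L}{3} + \frac{8C_1}{7}\Big)\delta^4. \qquad \square
\end{aligned} $$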

Combining Proposition 3.3, Lemma 3.4, and restricting to the leading
order term in (21), one obtains

$$ \mathbb{P}\{\beta_0(N^{\pm}) = \beta_0(Q_M^{\pm})\} \ge 1 - \frac{4}{3} \cdot
\sum_{k=1}^{M} C_0(x_{k-1}) \cdot (x_k - x_{k-1})^3. \tag{22} $$

Clearly, the resulting bound depends on the location of the sampling points,
which suggests maximizing the bound to optimize the location.
We first provide a heuristic argument for this optimal location, and present
the precise result afterwards. One can show that for arbitrary nonnegative
numbers $\delta_1,\dots,\delta_M \ge 0$ the inequality

$$ \sum_{k=1}^{M} \delta_k^3 \ge \frac{1}{M^2} \Big(\sum_{k=1}^{M} \delta_k\Big)^3 $$

holds, with equality if and only if
$\delta_1 = \delta_2 = \cdots = \delta_M$. Applying this inequality to the
sum in the right-hand side of (22) implies

$$ \sum_{k=1}^{M} C_0(x_{k-1}) \cdot (x_k - x_{k-1})^3 \ge \frac{1}{M^2} \cdot
\Big(\sum_{k=1}^{M} \sqrt[3]{C_0(x_{k-1})} \cdot (x_k - x_{k-1})\Big)^3 \tag{23} $$

with equality if and only if

$$ \sqrt[3]{C_0(x_{k-1})} \cdot (x_k - x_{k-1}) = \sqrt[3]{C_0(x_{\ell-1})} \cdot (x_\ell - x_{\ell-1})
\quad\text{for all } k, \ell = 1,\dots,M. \tag{24} $$

For large $M$, the sum on the right-hand side of (23) converges to the
integral of $C_0^{1/3}$ over $[a,b]$. The motivation for Theorem 1.2 is now
clear: Condition (24) suggests that for $M \to \infty$, the optimal estimate
can be achieved by choosing the sampling points in an
equi-$C_0^{1/3}$-area fashion, since the term
$C_0(x_{k-1})^{1/3}(x_k - x_{k-1})$ approximates the integral of $C_0^{1/3}$
over $[x_{k-1}, x_k]$. This heuristic forms the basis for the following
proof of our first main result.
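The heuristic can be tried out numerically. The sketch below compares the sum in (22) for a uniform grid and for an equi-area grid, using $C_0(x) = (1+x^2)^{-3}$ on $[-3,3]$ (the shape of (16) with the constant factor dropped), for which the equi-area points are available in closed form via the arctangent. The helper names and the choice $M = 50$ are assumptions for illustration.

```python
import numpy as np

def bound_sum(c0, grid):
    """Sum of C_0(x_{k-1}) * (x_k - x_{k-1})**3, the left-hand side of (23)."""
    d = np.diff(grid)
    return np.sum(c0(grid[:-1]) * d**3)

def hoelder_rhs(c0, grid):
    """Right-hand side of (23) for the same grid."""
    d = np.diff(grid)
    return np.sum(np.cbrt(c0(grid[:-1])) * d) ** 3 / len(d) ** 2

c0 = lambda x: (1.0 + x**2) ** (-3.0)
M = 50
uniform = np.linspace(-3.0, 3.0, M + 1)
# The integral of c0**(1/3) = 1/(1 + x**2) is arctan, so equal areas mean
# equally spaced angles between -arctan(3) and arctan(3).
theta = np.arctan(3.0)
equi = np.tan(np.linspace(-theta, theta, M + 1))

S_uniform = bound_sum(c0, uniform)
S_equi = bound_sum(c0, equi)
```

In this example the equi-area grid reduces the sum by more than a factor of two compared with the uniform grid, and both sums respect inequality (23).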



Proof of Theorem 1.2. Let
$\delta_{\max} := \max_{k=1,\dots,M} |x_k - x_{k-1}|$, and define the
positive number $m := \min_{x \in [a,b]} C_0(x)^{1/3} > 0$. Furthermore, let
$L := \max_{x \in [a,b]} |dC_0^{1/3}/dx|$. Then the mean value theorem
readily furnishes

$$ \Big| \sqrt[3]{C_0(x_{k-1})} \cdot (x_k - x_{k-1}) - \int_{x_{k-1}}^{x_k} \sqrt[3]{C_0(x)}\,dx \Big|
\le L(x_k - x_{k-1})^2 \tag{25} $$

for all $k = 1,\dots,M$. Due to the choice of the sampling points we further
have

$$ m \cdot (x_k - x_{k-1}) \le \int_{x_{k-1}}^{x_k} \sqrt[3]{C_0(x)}\,dx
= \frac{1}{M} \cdot \underbrace{\int_a^b \sqrt[3]{C_0(x)}\,dx}_{=:K}, \tag{26} $$

which in turn implies

$$ 0 < x_k - x_{k-1} \le \delta_{\max} \le \frac{K}{m \cdot M}
\quad\text{for all } k = 1,\dots,M. \tag{27} $$

Applying Lemma 3.4 to every subinterval formed by adjacent sampling
points, we now obtain together with (25), (26) and (27) the estimate

$$ \begin{aligned}
1 - \mathbb{P}\{\beta_0(N^{\pm}) = \beta_0(Q_M^{\pm})\}
&\le \frac{4}{3} \sum_{k=1}^{M} C_0(x_{k-1}) \cdot (x_k - x_{k-1})^3
+ C_2 \sum_{k=1}^{M} (x_k - x_{k-1})^4 \\
&\le \frac{4}{3} \cdot \sum_{k=1}^{M} \Big(\frac{K}{M} + \frac{LK^2}{m^2M^2}\Big)^3
+ \frac{C_2K^4}{m^4M^3} \\
&= \frac{4K^3}{3M^2} + O\Big(\frac{1}{M^3}\Big),
\end{aligned} $$

for some constant $C_2 > 0$. This is exactly (2). $\square$

4. Asymptotics of sign-change probabilities. Theorem 1.4 can be viewed
as a special case of Theorem 1.2. The content lies in the fact that under
the assumption of a Gaussian random field, the function $C_0$ can be
explicitly computed. However, this requires a quantitative understanding
of the asymptotic behavior of sign-distribution probabilities of
parameter-dependent Gaussian random variables, which is the focus of this section.

More precisely, let $T(\delta) = (T_1(\delta), \ldots, T_n(\delta))^t \in \mathbb{R}^n$ denote a
one-parameter family of $\mathbb{R}^n$-valued random Gaussian variables over a probability
space $(\Omega, \mathcal{F}, P)$, indexed by $\delta > 0$, and choose a sign sequence
$(s_1, \ldots, s_n) \in \{\pm 1\}^n$. Furthermore, let $\tau(\delta) \in \mathbb{R}^n$ denote an
arbitrary threshold vector. We are interested in the precise asymptotic behavior
as $\delta \to 0$ of the probability

$$P(\delta) = P\{ s_j (T_j(\delta) - \tau_j(\delta)) > 0 \text{ for all } j = 1, \ldots, n \}. \tag{28}$$

The following result is an extension of ([23], Proposition 4.1) which dealt
only with the special case $\tau = 0$.

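Proposition 4.1. Let $(s_1, \ldots, s_n) \in \{\pm 1\}^n$ denote a fixed sign sequence,
and consider one-parameter families of a threshold vector $\tau(\delta) \in \mathbb{R}^n$ and an
$\mathbb{R}^n$-valued random Gaussian variable $T(\delta)$ over a probability space
$(\Omega, \mathcal{F}, P)$, for $\delta > 0$. Assume that the following hold:

(i) For each $\delta > 0$, the Gaussian random variable $T(\delta)$ has
mean $0 \in \mathbb{R}^n$ and a positive definite covariance matrix $C(\delta) \in \mathbb{R}^{n \times n}$,
whose positive eigenvalues are given by $0 < \lambda_1(\delta) \le \cdots \le \lambda_n(\delta)$. The
corresponding orthonormalized eigenvectors are denoted by $v_1(\delta), \ldots, v_n(\delta)$.

(ii) There exists a vector $\bar{v}_1 = (\bar{v}_{11}, \ldots, \bar{v}_{1n})^t \in \mathbb{R}^n$ such that
$v_1(\delta) \to \bar{v}_1$ as $\delta \to 0$, and $s_j \cdot \bar{v}_{1j} > 0$ for all $j = 1, \ldots, n$.

(iii) The quotient $\lambda_1(\delta)/\lambda_k(\delta)$ converges to $0$ as $\delta \to 0$, for all $k = 2, \ldots, n$.

(iv) There exists a vector $\alpha = (\alpha_1, \ldots, \alpha_n)^t \in \mathbb{R}^n$ such that

$$\lim_{\delta \to 0} \frac{\tau(\delta) \cdot v_k(\delta)}{\lambda_k(\delta)^{1/2}} = \alpha_k \quad \text{for all } k = 1, \ldots, n. \tag{29}$$

Furthermore, for $\alpha$ as above define

$$S_\alpha = \frac{2}{2^{n/2} \cdot \Gamma(n/2)} \cdot e^{-\sum_{k=2}^n \alpha_k^2/2} \cdot \int_{\alpha_1}^\infty (s - \alpha_1)^{n-1} e^{-s^2/2}\,ds. \tag{30}$$

Then the probability $P(\delta)$ defined in (28) satisfies

$$\lim_{\delta \to 0} \sqrt{\frac{\det C(\delta)}{\lambda_1(\delta)^n}} \cdot P(\delta) = \frac{\Gamma(n/2) \cdot S_\alpha}{2 \cdot \pi^{n/2} \cdot (n-1)!} \cdot \prod_{j=1}^n \frac{1}{|\bar{v}_{1j}|}. \tag{31}$$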

For specific values of $n$, the integral in (30) can be simplified further.
For our one-dimensional application, we need the case $n = 3$, which is the
subject of the following remark.

Remark 4.2. Recall that $\Gamma(1/2) = \pi^{1/2}$, $\Gamma(1) = 1$, and $\Gamma(t+1) = t\,\Gamma(t)$
for $t > 0$. Furthermore, notice that $S_\alpha = 1$ for $\alpha = 0 \in \mathbb{R}^n$. In addition, for
$n = 3$ one can readily verify that

$$S_\alpha = \left( \frac{2}{\pi} \right)^{1/2} \cdot e^{-(\alpha_2^2 + \alpha_3^2)/2} \cdot \left( (1 + \alpha_1^2) \int_{\alpha_1}^\infty e^{-s^2/2}\,ds - \alpha_1 e^{-\alpha_1^2/2} \right). \tag{32}$$

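Proof. Define the diagonal matrix $S = (s_i \delta_{ij})_{i,j=1,\ldots,n}$, where $\delta_{ij}$
denotes the Kronecker delta, and let $Z^+ = \{ z \in \mathbb{R}^n : z_j > 0 \text{ for } j = 1, \ldots, n \}$.
Finally, let

$$D(\delta) = \lambda_1(\delta) \cdot S\,C(\delta)^{-1} S \quad \text{and} \quad d(\delta) = \lambda_1(\delta)^{-1/2} \cdot S \tau(\delta).$$

Using the density of the Gaussian distribution of $T(\delta)$ according to ([3],
Theorem 30.4), which exists since $C(\delta)$ is positive definite, in combination
with a simple rescaling and shifting of the coordinate system, the probability
in (28) can be rewritten as

$$\begin{aligned}
P(\delta) &= \frac{(2\pi)^{-n/2}}{\sqrt{\det C(\delta)}} \int_{S\tau(\delta) + Z^+} e^{-z^t S C(\delta)^{-1} S z/2}\,dz \\
&= \sqrt{\frac{\lambda_1(\delta)^n}{(2\pi)^n \det C(\delta)}} \int_{Z^+} e^{-(z + d(\delta))^t D(\delta) (z + d(\delta))/2}\,dz.
\end{aligned}$$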

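According to our assumptions, the eigenvalues $\mu_1(\delta), \ldots, \mu_n(\delta)$ of the
matrix $D(\delta)$ are given by

$$\mu_1(\delta) = 1 \quad \text{and} \quad \mu_k(\delta) = \frac{\lambda_1(\delta)}{\lambda_k(\delta)} \quad \text{for } k = 2, \ldots, n,$$

with corresponding orthonormalized eigenvectors $w_k(\delta) = S v_k(\delta)$, for $k = 1, \ldots, n$.
Now let $B(\delta)$ denote the orthogonal matrix with columns $w_1(\delta), \ldots, w_n(\delta)$
and introduce the change of variables $z = B(\delta)\zeta$. Moreover, let

$$Z(\zeta_1, \delta) = \{ (\zeta_2, \ldots, \zeta_n) \in \mathbb{R}^{n-1} : B(\delta)(\zeta_1, \zeta_2, \ldots, \zeta_n)^t \in Z^+ \} \subset \mathbb{R}^{n-1},$$

define real numbers $\eta_1(\delta), \ldots, \eta_n(\delta)$ by

$$\eta_k(\delta) = S\tau(\delta) \cdot w_k(\delta) = \tau(\delta) \cdot v_k(\delta) \quad \text{for } k = 1, \ldots, n,$$

and let

$$I(\zeta_1, \delta) = \int_{Z(\zeta_1, \delta)} \exp\left( -\sum_{k=1}^n \frac{\mu_k(\delta)}{2} \left( \zeta_k + \frac{\eta_k(\delta)}{\lambda_1(\delta)^{1/2}} \right)^2 \right) d(\zeta_2, \ldots, \zeta_n).$$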



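Due to (ii) and the definition of the signs $s_k$, the eigenvector $w_1(\delta)$ has
strictly positive components for all sufficiently small $\delta > 0$, and therefore
the identity

$$(z + d(\delta))^t D(\delta) (z + d(\delta)) = \sum_{k=1}^n \mu_k(\delta) \left( \zeta_k + \frac{\eta_k(\delta)}{\lambda_1(\delta)^{1/2}} \right)^2$$

implies

$$\begin{aligned}
\int_{Z^+} e^{-(z + d(\delta))^t D(\delta) (z + d(\delta))/2}\,dz
&= \int_{B(\delta)^{-1} Z^+} \exp\left( -\sum_{k=1}^n \frac{\mu_k(\delta)}{2} \left( \zeta_k + \frac{\eta_k(\delta)}{\lambda_1(\delta)^{1/2}} \right)^2 \right) d\zeta \\
&= \int_0^\infty I(\zeta_1, \delta)\,d\zeta_1.
\end{aligned} \tag{33}$$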

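From the definition of $I(\zeta_1, \delta)$, one can easily deduce

$$I(\zeta_1, \delta) = \zeta_1^{n-1} \int_{Z(1, \delta)} \exp\left( -\sum_{k=1}^n \frac{\mu_k(\delta)}{2} \left( \zeta_1 \xi_k + \frac{\eta_k(\delta)}{\lambda_1(\delta)^{1/2}} \right)^2 \right) d(\xi_2, \ldots, \xi_n),$$

where we define $\xi_1 = 1$. This representation furnishes for all $\zeta_1 > 0$ and $\delta > 0$
the estimate

$$I(\zeta_1, \delta) \le \zeta_1^{n-1} \cdot \mathrm{vol}_{n-1}(Z(1, \delta)) \cdot e^{-(\zeta_1 + \eta_1(\delta) \lambda_1(\delta)^{-1/2})^2/2}. \tag{34}$$

Again according to (ii), the $(n-1)$-dimensional volume of the simplex $Z(1, \delta)$
converges to the $(n-1)$-dimensional volume of the simplex

$$Z = \{ z \in Z^+ : (z - S\bar{v}_1, S\bar{v}_1) = 0 \} \subset \mathbb{R}^n,$$

which can be computed as

$$\mathrm{vol}_{n-1}(Z) = \frac{1}{(n-1)!} \cdot \prod_{i=1}^n \frac{1}{|\bar{v}_{1i}|}.$$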
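
The volume formula for the simplex $Z$ can be checked numerically. In the sketch below, which is an illustration under stated assumptions and not part of the original argument, `w` plays the role of the unit vector $S\bar{v}_1$ with positive components, the vertices of $Z$ are taken to be the axis intercepts $e_j / w_j$ of the hyperplane $z \cdot w = 1$, and the volume is computed independently from the Gram determinant of the edge vectors. Both function names are hypothetical.

```python
import math
import numpy as np

def simplex_volume_formula(w):
    """Closed form 1 / ((n-1)! * prod(w_j)) for a unit vector w with w_j > 0."""
    n = len(w)
    return 1.0 / (math.factorial(n - 1) * float(np.prod(w)))

def simplex_volume_gram(w):
    """(n-1)-dimensional volume of the simplex with vertices e_j / w_j,
    computed from the Gram determinant of its edge vectors."""
    n = len(w)
    vertices = np.diag(1.0 / np.asarray(w, dtype=float))  # row j is e_j / w_j
    edges = vertices[1:] - vertices[0]                     # shape (n-1, n)
    gram = edges @ edges.T
    return math.sqrt(np.linalg.det(gram)) / math.factorial(n - 1)

# The case needed later in the paper: |S v_1| = (1, 2, 1) / sqrt(6)
w3 = np.array([1.0, 2.0, 1.0]) / math.sqrt(6.0)
```

For `w3` both functions should agree on the triangle area $3\sqrt{6}/2$.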



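Now let $\zeta_1 > 0$ be arbitrary, but fixed. Notice that since we did not make any
assumptions about the asymptotic behavior of the eigenvectors $w_2(\delta), \ldots, w_n(\delta)$
for $\delta \to 0$, the sets $Z(1, \delta)$ do not have to converge. Yet, (ii) yields the
existence of a compact subset $K \subset \mathbb{R}^{n-1}$ such that $Z(1, \delta) \subset K$ for all sufficiently
small $\delta > 0$. Furthermore, we have

$$\begin{aligned}
\sum_{k=1}^n \mu_k(\delta) \left( \zeta_1 \xi_k + \frac{\eta_k(\delta)}{\lambda_1(\delta)^{1/2}} \right)^2
&= \left( \zeta_1 + \frac{\eta_1(\delta)}{\lambda_1(\delta)^{1/2}} \right)^2
+ \sum_{k=2}^n \left( \frac{\lambda_1(\delta)}{\lambda_k(\delta)} \zeta_1^2 \xi_k^2 + \frac{2 \zeta_1 \xi_k \eta_k(\delta) \lambda_1(\delta)^{1/2}}{\lambda_k(\delta)} + \frac{\eta_k(\delta)^2}{\lambda_k(\delta)} \right) \\
&\to (\zeta_1 + \alpha_1)^2 + \sum_{k=2}^n \alpha_k^2
\end{aligned}$$

as $\delta \to 0$. Due to (iii) and (iv), this convergence is uniform on $K$. Therefore,
we have

$$\lim_{\delta \to 0} I(\zeta_1, \delta) = \zeta_1^{n-1} \cdot \mathrm{vol}_{n-1}(Z) \cdot e^{-(\zeta_1 + \alpha_1)^2/2} \cdot e^{-(\alpha_2^2 + \cdots + \alpha_n^2)/2}$$

for all $\zeta_1 > 0$.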

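Due to (34) and $\mathrm{vol}_{n-1}(Z(1, \delta)) \to \mathrm{vol}_{n-1}(Z)$, we can now apply the
dominated convergence theorem to pass to the limit $\delta \to 0$ in (33), and this
furnishes

$$\begin{aligned}
\lim_{\delta \to 0} \int_{Z^+} e^{-(z + d(\delta))^t D(\delta) (z + d(\delta))/2}\,dz
&= \mathrm{vol}_{n-1}(Z) \cdot e^{-(\alpha_2^2 + \cdots + \alpha_n^2)/2} \cdot \int_0^\infty \zeta_1^{n-1} e^{-(\zeta_1 + \alpha_1)^2/2}\,d\zeta_1 \\
&= \mathrm{vol}_{n-1}(Z) \cdot e^{-(\alpha_2^2 + \cdots + \alpha_n^2)/2} \cdot \int_{\alpha_1}^\infty (s - \alpha_1)^{n-1} e^{-s^2/2}\,ds.
\end{aligned}$$

□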



We close this section with a corollary to Proposition 4.1. In our
applications of the above result, we are not only interested in the
asymptotic behavior of $P(\delta)$ as defined in (28), that is, for the fixed sign
sequence $(s_1, \ldots, s_n)$, but also in the corresponding probability for the negative
sign sequence $(-s_1, \ldots, -s_n)$.

More precisely, if $T(\delta) = (T_1(\delta), \ldots, T_n(\delta))^t \in \mathbb{R}^n$ denotes again a
one-parameter family of $\mathbb{R}^n$-valued random Gaussian variables over a probability
space $(\Omega, \mathcal{F}, P)$, indexed by $\delta > 0$, and if we choose both a sign sequence
$(s_1, \ldots, s_n) \in \{\pm 1\}^n$ and a one-parameter family $\tau(\delta) \in \mathbb{R}^n$ of threshold
vectors, then we are interested in the asymptotic behavior as $\delta \to 0$ of the
probability

$$\begin{aligned}
P^\pm(\delta) = {} & P\{ s_j (T_j(\delta) - \tau_j(\delta)) > 0 \text{ for all } j = 1, \ldots, n \} \\
& + P\{ s_j (T_j(\delta) - \tau_j(\delta)) < 0 \text{ for all } j = 1, \ldots, n \}.
\end{aligned} \tag{35}$$

This is the subject of the following corollary.

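Corollary 4.3. Let $(s_1, \ldots, s_n) \in \{\pm 1\}^n$ denote a fixed sign sequence,
let $\tau(\delta) \in \mathbb{R}^n$ denote a threshold vector, and consider a one-parameter
family $T(\delta)$, $\delta > 0$, of $\mathbb{R}^n$-valued random Gaussian variables over a probability
space $(\Omega, \mathcal{F}, P)$ which satisfies all the assumptions of Proposition 4.1. Then
the probability $P^\pm(\delta)$ defined in (35) satisfies

$$\lim_{\delta \to 0} \sqrt{\frac{\det C(\delta)}{\lambda_1(\delta)^n}} \cdot P^\pm(\delta) = \frac{\Gamma(n/2) \cdot S^\pm}{2 \cdot \pi^{n/2} \cdot (n-1)!} \cdot \prod_{j=1}^n \frac{1}{|\bar{v}_{1j}|}, \tag{36}$$

where $S^\pm = S_\alpha + S_{-\alpha}$, with $\alpha$ as in (29) and $S_\alpha$ as in (30). Moreover, for
the special case $n = 3$ one obtains

$$S^\pm = 2 e^{-(\alpha_2^2 + \alpha_3^2)/2} \cdot (1 + \alpha_1^2). \tag{37}$$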

Proof. One only has to apply Proposition 4.1 twice: first with the
given sign vector $(s_1, \ldots, s_n)$, and then with the sign vector $(-s_1, \ldots, -s_n)$.
Notice that in the latter case, we have to use the eigenvector $-v_1(\delta)$ instead
of $v_1(\delta)$, which leads to $-\alpha_1$ instead of $\alpha_1$ in (29); everything else remains
unchanged. This immediately implies (36). As for (37), one only has to notice
that

$$\int_{\alpha_1}^\infty e^{-s^2/2}\,ds + \int_{-\alpha_1}^\infty e^{-s^2/2}\,ds = \int_{-\infty}^\infty e^{-s^2/2}\,ds = \sqrt{2\pi}$$

and employ Remark 4.2. □
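
The identity (37) lends itself to a quick numerical sanity check. The sketch below is an illustration only: it assumes that $S_\alpha$ is given by $\frac{2}{2^{n/2}\Gamma(n/2)} e^{-\sum_{k \ge 2} \alpha_k^2/2} \int_{\alpha_1}^\infty (s - \alpha_1)^{n-1} e^{-s^2/2}\,ds$, evaluates the integral by a plain trapezoidal rule on a truncated interval (the cutoff and grid size are arbitrary choices), and compares $S_\alpha + S_{-\alpha}$ with $2 e^{-(\alpha_2^2 + \alpha_3^2)/2} (1 + \alpha_1^2)$ for $n = 3$. The function names are hypothetical.

```python
import math
import numpy as np

def S(alpha, n_grid=200001, cutoff=12.0):
    """Evaluate S_alpha by trapezoidal quadrature; the upper integration
    limit is truncated where the Gaussian factor is negligible."""
    n = len(alpha)
    a1 = alpha[0]
    s = np.linspace(a1, max(a1, 0.0) + cutoff, n_grid)
    f = (s - a1) ** (n - 1) * np.exp(-0.5 * s**2)
    h = s[1] - s[0]
    integral = h * (f.sum() - 0.5 * (f[0] + f[-1]))
    prefactor = 2.0 / (2.0 ** (n / 2) * math.gamma(n / 2))
    return prefactor * math.exp(-0.5 * sum(a * a for a in alpha[1:])) * integral

def S_pm_closed_form(alpha):
    """Right-hand side of (37), valid for n = 3 only."""
    a1, a2, a3 = alpha
    return 2.0 * math.exp(-0.5 * (a2**2 + a3**2)) * (1.0 + a1**2)

alpha = (0.7, -0.3, 1.2)                 # arbitrary test vector
neg_alpha = tuple(-a for a in alpha)
S_pm_numeric = S(alpha) + S(neg_alpha)
```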

5. Sampling based on spatial correlations. The goal of this section is the 
proof of Theorem 1.4. To do this, we need to relate the spatial correlation 
function R to local probability asymptotics. For this, we use the following 
lemma. 

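Lemma 5.1. Consider a Gaussian random field $u : [a,b] \times \Omega \to \mathbb{R}$
satisfying (G1) and (G2). For $x \in [a,b)$ and sufficiently small values of $\delta > 0$,
define the random vector $T(\delta) = (T_1(\delta), T_2(\delta), T_3(\delta))^t$ via

$$T_1(\delta) = u(x), \quad T_2(\delta) = u\!\left( x + \frac{\delta}{2} \right), \quad \text{and} \quad T_3(\delta) = u(x + \delta). \tag{38}$$

Then $T(\delta)$ is a centered Gaussian random variable with positive definite
covariance matrix $C(\delta)$. Moreover, if we denote the eigenvalues of $C(\delta)$ by
$0 < \lambda_1(\delta) < \lambda_2(\delta) < \lambda_3(\delta)$, then

$$\lambda_1(\delta) = \frac{\det \mathcal{R}(x)}{96\,\mathcal{R}^{\min}_3(x)} \cdot \delta^4 + O(\delta^5), \quad
\lambda_2(\delta) = \frac{\mathcal{R}^{\min}_3(x)}{2 R_{0,0}(x)} \cdot \delta^2 + O(\delta^3), \quad
\lambda_3(\delta) = 3 R_{0,0}(x) + O(\delta),$$

where we use the notation introduced in (3), (4) and (5). In addition, we
can choose the normalized eigenvectors $v_1(\delta)$, $v_2(\delta)$ and $v_3(\delta)$ corresponding
to these eigenvalues in such a way that

$$\lim_{\delta \to 0} v_1(\delta) = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}, \quad
\lim_{\delta \to 0} v_2(\delta) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \quad
\lim_{\delta \to 0} v_3(\delta) = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

Finally, for a $C^3$-function $\mu : [a,b] \to \mathbb{R}$ define the vector
$\tau(\delta) = (\tau_1(\delta), \tau_2(\delta), \tau_3(\delta))^t$ via

$$\tau_1(\delta) = \mu(x), \quad \tau_2(\delta) = \mu\!\left( x + \frac{\delta}{2} \right), \quad \text{and} \quad \tau_3(\delta) = \mu(x + \delta). \tag{39}$$

Then

$$\begin{aligned}
\tau(\delta) \cdot v_1(\delta) &= \frac{\mathcal{R}^{\min}_1(x) \mu(x) - \mathcal{R}^{\min}_2(x) \mu'(x) + \mathcal{R}^{\min}_3(x) \mu''(x)}{4 \sqrt{6}\,\mathcal{R}^{\min}_3(x)} \cdot \delta^2 + O(\delta^3), \\
\tau(\delta) \cdot v_2(\delta) &= \frac{R_{1,0}(x) \mu(x) - R_{0,0}(x) \mu'(x)}{\sqrt{2}\,R_{0,0}(x)} \cdot \delta + O(\delta^2), \\
\tau(\delta) \cdot v_3(\delta) &= \sqrt{3} \cdot \mu(x) + O(\delta).
\end{aligned}$$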

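Proof. Due to our assumptions on $u$, the vector $T(\delta)$ is normally
distributed with mean $0 \in \mathbb{R}^3$ and covariance matrix $C(\delta) \in \mathbb{R}^{3 \times 3}$ given by

$$C(\delta) = \begin{pmatrix}
r(0,0) & r(0,\delta/2) & r(0,\delta) \\
r(0,\delta/2) & r(\delta/2,\delta/2) & r(\delta/2,\delta) \\
r(0,\delta) & r(\delta/2,\delta) & r(\delta,\delta)
\end{pmatrix},$$

where we use the abbreviation

$$r(\delta_1, \delta_2) = R(x + \delta_1, x + \delta_2).$$

For $(\delta_1, \delta_2) \to 0$, the function $r$ can be expanded as

$$\begin{aligned}
r(\delta_1, \delta_2) = {} & R_{0,0}(x) + R_{1,0}(x) \delta_1 + R_{1,0}(x) \delta_2 + \frac{R_{2,0}(x)}{2} \delta_1^2 + R_{1,1}(x) \delta_1 \delta_2 + \frac{R_{2,0}(x)}{2} \delta_2^2 \\
& + \frac{R_{3,0}(x)}{6} \delta_1^3 + \frac{R_{2,1}(x)}{2} \delta_1^2 \delta_2 + \frac{R_{2,1}(x)}{2} \delta_1 \delta_2^2 + \frac{R_{3,0}(x)}{6} \delta_2^3 + O(|(\delta_1, \delta_2)|^4),
\end{aligned}$$

where the $R_{k,\ell}$ were defined in (3). Furthermore, (G2) implies that we have
the strict inequalities

$$R_{0,0}(x) > 0, \quad \mathcal{R}^{\min}_3(x) > 0 \quad \text{as well as} \quad \det \mathcal{R}(x) > 0.$$

These strict inequalities ensure that in all of the expansions derived below
the leading order coefficients are positive.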

Using the above expansion of $r$, the determinant of the covariance
matrix $C(\delta)$ of the random vector $T(\delta)$ can be written as

$$\det C(\delta) = \frac{1}{64} \cdot \det \mathcal{R}(x) \cdot \delta^6 + O(\delta^7),$$

that is, the covariance matrix is positive definite for sufficiently small $\delta > 0$.
Furthermore, by applying the Newton polygon method [26, 28] to the
characteristic polynomial $\det(C(\delta) - \lambda I)$ it can be shown that in the limit
$\delta \to 0$ the three eigenvalues $\lambda_k(\delta)$, for $k = 1, 2, 3$, of $C(\delta)$ are given by the
expansions in the formulation of Lemma 5.1.

We now turn our attention to the asymptotic statements concerning the
eigenvectors of the covariance matrix. According to the form of $C(\delta)$, we
have

$$\lim_{\delta \to 0} C(\delta) = R_{0,0}(x) \cdot \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix},$$

where the limit has a double eigenvalue $0$, as well as the simple eigenvalue
$3 R_{0,0}(x)$ with normalized eigenvector $(1, 1, 1)^t / 3^{1/2}$. Due to standard
results on the perturbation of simple eigenvalues and corresponding
eigenvectors [30], this implies that $v_3(\delta)$ can be chosen as in the formulation of
the lemma.

In order to determine the asymptotic behavior of the eigenvector
corresponding to $\lambda_1$, we consider the adjoint of the covariance matrix, whose
expansion is given by

$$\operatorname{adj} C(\delta) = \frac{\mathcal{R}^{\min}_3(x)}{4} \cdot \begin{pmatrix} 1 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 1 \end{pmatrix} \cdot \delta^2 + O(\delta^3).$$

The constant coefficient matrix has the double eigenvalue $0$, as well as the
positive eigenvalue $6$ with associated unnormalized eigenvector $(1, -2, 1)^t$.
Since the eigenspace for the largest eigenvalue of the adjoint matrix
coincides with the eigenspace for the eigenvalue $\lambda_1(\delta)$ of $C(\delta)$, the simplicity of
these eigenvalues shows that we can choose a normalized eigenvector $v_1(\delta)$
for $\lambda_1(\delta)$ with $v_1(\delta) \to (1, -2, 1)^t / 6^{1/2}$ for $\delta \to 0$. Finally, the orthogonality
of the three eigenvectors shows that we can choose a normalized
eigenvector $v_2(\delta)$ for $\lambda_2(\delta)$ with $v_2(\delta) \to (1, 0, -1)^t / 2^{1/2}$ for $\delta \to 0$.

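We now turn our attention to the asymptotics of the inner products
$\tau(\delta) \cdot v_k(\delta)$. Since $\mu$ is a $C^3$-function, we can write

$$\tau(\delta) = \mu(x) \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \frac{\mu'(x)}{2} \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} \cdot \delta + \frac{\mu''(x)}{8} \begin{pmatrix} 0 \\ 1 \\ 4 \end{pmatrix} \cdot \delta^2 + O(\delta^3),$$

and this representation immediately furnishes $\tau(\delta) \cdot v_3(\delta) \to 3^{1/2} \cdot \mu(x)$ as
$\delta \to 0$. The statements concerning $\tau(\delta) \cdot v_1(\delta)$ and $\tau(\delta) \cdot v_2(\delta)$ are more
involved, and rely on expansions of the eigenvectors in terms of $\delta$.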

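As for the first eigenvector, write $v_1(\delta) = (v_{1,1}(\delta), v_{1,2}(\delta), v_{1,3}(\delta))^t$, and
consider the functions

$$w_{1,k}(\delta) = -\frac{2}{\sqrt{6} \cdot v_{1,2}(\delta)} \cdot v_{1,k}(\delta) \quad \text{for } k = 1, 2, 3.$$

Then the vector $w_1(\delta) = (w_{1,1}(\delta), w_{1,2}(\delta), w_{1,3}(\delta))^t$ is defined for sufficiently
small $\delta > 0$, and for these $\delta$ we have

$$w_{1,2}(\delta) = -\frac{2}{\sqrt{6}} \quad \text{as well as} \quad (C(\delta) - \lambda_1(\delta) I)\,w_1(\delta) = 0.$$

Using the abbreviation $C(\delta) = (c_{i,j}(\delta))_{i,j=1,2,3}$, the latter system is
equivalent to

$$\begin{aligned}
(c_{1,1}(\delta) - \lambda_1(\delta))\,w_{1,1}(\delta) + c_{1,3}(\delta)\,w_{1,3}(\delta) &= \frac{2}{\sqrt{6}} \cdot c_{1,2}(\delta), \\
c_{3,1}(\delta)\,w_{1,1}(\delta) + (c_{3,3}(\delta) - \lambda_1(\delta))\,w_{1,3}(\delta) &= \frac{2}{\sqrt{6}} \cdot c_{3,2}(\delta),
\end{aligned}$$

which immediately implies

$$\begin{aligned}
w_{1,1}(\delta) &= \frac{2}{\sqrt{6}} \cdot \frac{(c_{3,3}(\delta) - \lambda_1(\delta))\,c_{1,2}(\delta) - c_{3,2}(\delta)\,c_{1,3}(\delta)}{(c_{1,1}(\delta) - \lambda_1(\delta))(c_{3,3}(\delta) - \lambda_1(\delta)) - c_{1,3}(\delta)\,c_{3,1}(\delta)}, \\
w_{1,3}(\delta) &= \frac{2}{\sqrt{6}} \cdot \frac{(c_{1,1}(\delta) - \lambda_1(\delta))\,c_{3,2}(\delta) - c_{1,2}(\delta)\,c_{3,1}(\delta)}{(c_{1,1}(\delta) - \lambda_1(\delta))(c_{3,3}(\delta) - \lambda_1(\delta)) - c_{1,3}(\delta)\,c_{3,1}(\delta)}.
\end{aligned}$$

Expanding the right-hand sides now furnishes

$$w_1(\delta) = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} + \frac{\mathcal{R}^{\min}_2(x)}{4 \sqrt{6}\,\mathcal{R}^{\min}_3(x)} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \cdot \delta + \begin{pmatrix} w_{1,1,2} \\ 0 \\ w_{1,3,2} \end{pmatrix} \cdot \delta^2 + O(\delta^3)$$

with

$$w_{1,1,2} + w_{1,3,2} = \frac{1}{4 \sqrt{6}} \cdot \frac{\mathcal{R}^{\min}_1(x)}{\mathcal{R}^{\min}_3(x)}.$$

This finally implies

$$\tau(\delta) \cdot w_1(\delta) = \frac{\mathcal{R}^{\min}_1(x) \mu(x) - \mathcal{R}^{\min}_2(x) \mu'(x) + \mathcal{R}^{\min}_3(x) \mu''(x)}{4 \sqrt{6}\,\mathcal{R}^{\min}_3(x)} \cdot \delta^2 + O(\delta^3),$$

and together with

$$\tau(\delta) \cdot v_1(\delta) = -\frac{\sqrt{6}\,v_{1,2}(\delta)}{2} \cdot \tau(\delta) \cdot w_1(\delta) \quad \text{and} \quad \lim_{\delta \to 0} \left( -\frac{\sqrt{6}\,v_{1,2}(\delta)}{2} \right) = 1$$

this establishes the asymptotic behavior of $\tau(\delta) \cdot v_1(\delta)$.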

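Finally, we turn our attention to the second eigenvector. Following our
above approach, we write $v_2(\delta) = (v_{2,1}(\delta), v_{2,2}(\delta), v_{2,3}(\delta))^t$, and consider the
functions

$$w_{2,k}(\delta) = \frac{1}{\sqrt{2} \cdot v_{2,1}(\delta)} \cdot v_{2,k}(\delta) \quad \text{for } k = 1, 2, 3.$$

Then the vector $w_2(\delta) = (w_{2,1}(\delta), w_{2,2}(\delta), w_{2,3}(\delta))^t$ is defined for sufficiently
small $\delta > 0$, and for these $\delta$ we have

$$w_{2,1}(\delta) = \frac{1}{\sqrt{2}} \quad \text{as well as} \quad (C(\delta) - \lambda_2(\delta) I)\,w_2(\delta) = 0.$$

Using again the abbreviation $C(\delta) = (c_{i,j}(\delta))_{i,j=1,2,3}$, the latter system is
equivalent to

$$\begin{aligned}
(c_{2,2}(\delta) - \lambda_2(\delta))\,w_{2,2}(\delta) + c_{2,3}(\delta)\,w_{2,3}(\delta) &= -\frac{1}{\sqrt{2}} \cdot c_{2,1}(\delta), \\
c_{3,2}(\delta)\,w_{2,2}(\delta) + (c_{3,3}(\delta) - \lambda_2(\delta))\,w_{2,3}(\delta) &= -\frac{1}{\sqrt{2}} \cdot c_{3,1}(\delta),
\end{aligned}$$

which immediately implies

$$\begin{aligned}
w_{2,2}(\delta) &= -\frac{1}{\sqrt{2}} \cdot \frac{(c_{3,3}(\delta) - \lambda_2(\delta))\,c_{2,1}(\delta) - c_{2,3}(\delta)\,c_{3,1}(\delta)}{(c_{2,2}(\delta) - \lambda_2(\delta))(c_{3,3}(\delta) - \lambda_2(\delta)) - c_{2,3}(\delta)\,c_{3,2}(\delta)}, \\
w_{2,3}(\delta) &= -\frac{1}{\sqrt{2}} \cdot \frac{(c_{2,2}(\delta) - \lambda_2(\delta))\,c_{3,1}(\delta) - c_{3,2}(\delta)\,c_{2,1}(\delta)}{(c_{2,2}(\delta) - \lambda_2(\delta))(c_{3,3}(\delta) - \lambda_2(\delta)) - c_{2,3}(\delta)\,c_{3,2}(\delta)}.
\end{aligned}$$

Expanding the right-hand sides now furnishes

$$w_2(\delta) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + \begin{pmatrix} 0 \\ w_{2,2,1} \\ w_{2,3,1} \end{pmatrix} \cdot \delta + O(\delta^2)$$

with

$$w_{2,2,1} + w_{2,3,1} = \frac{1}{\sqrt{2}} \cdot \frac{R_{1,0}(x)}{R_{0,0}(x)}.$$

This finally implies

$$\tau(\delta) \cdot w_2(\delta) = \frac{R_{1,0}(x) \mu(x) - R_{0,0}(x) \mu'(x)}{\sqrt{2}\,R_{0,0}(x)} \cdot \delta + O(\delta^2)$$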






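and together with

$$\tau(\delta) \cdot v_2(\delta) = \sqrt{2}\,v_{2,1}(\delta) \cdot \tau(\delta) \cdot w_2(\delta) \quad \text{and} \quad \lim_{\delta \to 0} \sqrt{2}\,v_{2,1}(\delta) = 1$$

this establishes the asymptotic behavior of $\tau(\delta) \cdot v_2(\delta)$. □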
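
The leading-order behavior of the eigenvalues, the determinant and the first eigenvector can be checked numerically for a concrete covariance function. The sketch below is not from the paper and rests on several assumptions: it uses the hypothetical stationary example $R(x,y) = e^{-(x-y)^2/2}$, it takes $R_{k,\ell}$ to be the mixed partial derivatives of $R$ evaluated on the diagonal (computed by hand for this example), it takes $\mathcal{R}^{\min}_3 = R_{0,0} R_{1,1} - R_{1,0}^2$, and it compares against the leading terms $\det \mathcal{R}\,\delta^4 / (96 \mathcal{R}^{\min}_3)$, $\mathcal{R}^{\min}_3\,\delta^2 / (2 R_{0,0})$, $3 R_{0,0}$ and $\det \mathcal{R}\,\delta^6 / 64$.

```python
import numpy as np

def gaussian_cov(s, t):
    # hypothetical example covariance R(x, y) = exp(-(x - y)^2 / 2)
    return np.exp(-0.5 * (s - t) ** 2)

def covariance_matrix(x, delta):
    # covariance of (u(x), u(x + delta/2), u(x + delta))
    pts = np.array([x, x + delta / 2.0, x + delta])
    return gaussian_cov(pts[:, None], pts[None, :])

# (R_{k,l})_{k,l=0,1,2} on the diagonal, computed by hand for this example
R_mat = np.array([[1.0, 0.0, -1.0],
                  [0.0, 1.0, 0.0],
                  [-1.0, 0.0, 3.0]])
det_R = np.linalg.det(R_mat)
R00 = R_mat[0, 0]
Rmin3 = R_mat[0, 0] * R_mat[1, 1] - R_mat[0, 1] ** 2

def compare(delta, x=0.0):
    """Return (eigenvalue ratios, determinant ratio, |v_1 . (1,-2,1)/sqrt(6)|),
    each of which should approach 1 as delta -> 0."""
    C = covariance_matrix(x, delta)
    lam, vec = np.linalg.eigh(C)             # eigenvalues in ascending order
    predicted = np.array([det_R / (96.0 * Rmin3) * delta**4,
                          Rmin3 / (2.0 * R00) * delta**2,
                          3.0 * R00])
    det_ratio = np.linalg.det(C) / (det_R * delta**6 / 64.0)
    align = abs(vec[:, 0] @ np.array([1.0, -2.0, 1.0]) / np.sqrt(6.0))
    return lam / predicted, det_ratio, align

ratios, det_ratio, align = compare(0.1)
```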

After these preparations, we are finally in a position to prove our second 
main result. As mentioned in Section 1, this result provides a general means 
for determining the location of sampling points of random fields in such a 
way that the topology of the underlying nodal sets is correctly recognized 
with the largest probability. In addition, the sampling density can readily be 
determined from derivatives of the spatial correlation function of the random 
field. 

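Proof of Theorem 1.4. Due to our assumptions, the random variable
$u(x, \cdot) : \Omega \to \mathbb{R}$ is normally distributed with mean $0$ and its variance $R_{0,0}(x)$
is positive for each $x \in [a,b]$ due to (G2). This immediately implies (A1).
Furthermore, (A2) follows readily from [1], Theorem 3.2.1. Thus, in order
to apply Theorem 1.2 we only have to verify (A3).

For this, we apply Corollary 4.3 with $n = 3$ and sign vector $(s_1, s_2, s_3) = (1, -1, 1)$.
Fix $x \in [a,b)$ and consider the $\delta$-dependent three-dimensional
random vector $T(\delta)$ defined in (38). Then according to Lemma 5.1, this random
vector satisfies all of the assumptions of Proposition 4.1 and Corollary 4.3
with

$$\det C(\delta) = \frac{1}{64} \cdot \det \mathcal{R}(x) \cdot \delta^6 + O(\delta^7) \quad \text{and} \quad \lambda_1(\delta) = \frac{\det \mathcal{R}(x)}{96\,\mathcal{R}^{\min}_3(x)} \cdot \delta^4 + O(\delta^5)$$

as well as

$$\begin{aligned}
\alpha_1 &= \frac{\mathcal{R}^{\min}_1(x) \mu(x) - \mathcal{R}^{\min}_2(x) \mu'(x) + \mathcal{R}^{\min}_3(x) \mu''(x)}{(\det \mathcal{R}(x) \cdot \mathcal{R}^{\min}_3(x))^{1/2}}, \\
\alpha_2 &= \frac{R_{1,0}(x) \mu(x) - R_{0,0}(x) \mu'(x)}{(R_{0,0}(x) \cdot \mathcal{R}^{\min}_3(x))^{1/2}}, \\
\alpha_3 &= \frac{\mu(x)}{R_{0,0}(x)^{1/2}}.
\end{aligned}$$

Applying Corollary 4.3, we then obtain

$$\lim_{\delta \to 0} \sqrt{\frac{\det C(\delta)}{\lambda_1(\delta)^3}} \cdot P^\pm(\delta) = \frac{3 \sqrt{6}}{4 \pi} \cdot (1 + \alpha_1^2) \cdot e^{-(\alpha_2^2 + \alpha_3^2)/2},$$

where we used the formula for $S^\pm$ given in (37). In combination with the
above expansions for $\det C(\delta)$ and $\lambda_1(\delta)$, this limit furnishes

$$P^\pm(\delta) = \frac{(1 + \alpha_1^2) \cdot e^{-(\alpha_2^2 + \alpha_3^2)/2}}{64 \pi} \cdot \frac{\det \mathcal{R}(x)}{\mathcal{R}^{\min}_3(x)^{3/2}} \cdot \delta^3 + O(\delta^4).$$

Thus, assumption (A3) is satisfied with $C_0(x) = 3C(x)/4$, and Theorem 1.4
follows now immediately from Theorem 1.2. □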
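
As a plausibility check of the cubic leading term (an illustration added here, not part of the original argument), consider again the hypothetical stationary covariance $R(x,y) = e^{-(x-y)^2/2}$ with threshold $\mu \equiv 0$, so that all $\alpha_k$ vanish. Assuming $\det \mathcal{R} = 2$ and $\mathcal{R}^{\min}_3 = 1$ for this example (hand-computed), the leading term $\det \mathcal{R}\,\delta^3 / (64 \pi (\mathcal{R}^{\min}_3)^{3/2})$ becomes $\delta^3 / (32\pi)$. For a centered trivariate Gaussian vector the orthant probability has the classical closed form $\frac{1}{8} + \frac{1}{4\pi}(\arcsin \rho_{12} + \arcsin \rho_{13} + \arcsin \rho_{23})$, so the double sign-change probability can be evaluated exactly and compared. The function names are hypothetical.

```python
import math

def double_sign_change_probability(delta):
    """Exact probability that (u(x), u(x + delta/2), u(x + delta)) has sign
    pattern (+,-,+) or (-,+,-) for R(x, y) = exp(-(x - y)^2 / 2), mu = 0."""
    a = math.exp(-delta**2 / 8.0)   # correlation at distance delta / 2
    b = math.exp(-delta**2 / 2.0)   # correlation at distance delta
    # (T_1, -T_2, T_3) has pairwise correlations -a, b, -a
    orthant = 0.125 + (math.asin(-a) + math.asin(b) + math.asin(-a)) / (4.0 * math.pi)
    return 2.0 * orthant            # both patterns are equally likely

def leading_term(delta):
    """Cubic leading term delta^3 / (32 pi) for this example."""
    return delta**3 / (32.0 * math.pi)

ratio = double_sign_change_probability(0.1) / leading_term(0.1)
```

The ratio should be close to 1 for small `delta`, and halving `delta` should reduce the exact probability by a factor of roughly 8.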

6. Concluding remarks. At first glance, the title of this paper may ap- 
pear somewhat misleading or more ambitious than the results delivered. Af- 
ter all, the techniques of proof are based on classical probabilistic arguments. 
However, the results are new and the examples of Section 2 demonstrate that 
they have interesting nonintuitive implications. 

A reasonable question is why were these results not discovered sooner. 
We believe that the answer comes from the fact that we are approaching 
the problem of optimal sampling from the point of view of trying to obtain 
topological information. This point of view had been taken previously in 
the work of Adler and Taylor [1, 2]. Their main focus, however, was the 
estimation of excursion probabilities, that is, the likelihood that a given 
random function exceeds a certain threshold. In [1, 2], it is shown that such 
excursion probabilities can be well-approximated by studying the geometry 
of random sub- or super-level sets of random fields. More precisely, it is 
shown that the expected value of the Euler characteristic of super-level sets 
approximates excursion probabilities for large values of the threshold, and 
that it is possible to derive explicit formulas for the expected values of the 
Euler characteristic and other intrinsic volumes of nodal domains of random 
fields. 

All of the above results concern the intrinsic volumes of the nodal domains — 
which are additive set functionals, and therefore computable via local con- 
siderations alone [20, 25]. In contrast, in previous work [14] we have demon- 
strated that the homological analysis of patterns of nodal sets can uncover 
phenomena that cannot be captured using for example only the Euler char- 
acteristic. The more detailed information on the geometry of patterns en- 
coded in homology is an inherently global quantity and cannot be computed 
through local considerations alone. On the other hand, recent computational 
advances allow for the fast computation of homological information based 
on discretized nodal domains. For this reason, we focus on the interface be- 
tween the discretization and the underlying nodal domain, rather than the 
homology of the nodal domain directly, and then quantify the likelihood of 
error in the probabilistic setting. In this sense, our approach complements 
the above-mentioned results on the geometry of random fields by Adler and 
Taylor [1, 2]. 

Given the current activity surrounding the ideas of using topological 
methods for data analysis and remote sensing [8, 9, 15], we believe the im- 
portance of this perspective will grow. Thus, the title of our paper is chosen 
in part to encourage the interested reader to consider the natural gener- 
alizations of this work to higher-dimensional domains where the question 
becomes one of optimizing the homology of the generalized nodal sets in
terms of homology computed using a complex derived from a nonuniform
sampling of space.